38 research outputs found

    On the usage of the probability integral transform to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems

    We present a new distributed fuzzy partitioning method to reduce the complexity of multi-way fuzzy decision trees in Big Data classification problems. The proposed algorithm builds a fixed number of fuzzy sets for all variables and adjusts their shape and position to the real distribution of the training data. A two-step process is applied: 1) transformation of the original distribution into a standard uniform distribution by means of the probability integral transform. Since the original distribution is generally unknown, the cumulative distribution function is approximated by computing the q-quantiles of the training set; 2) construction of a Ruspini strong fuzzy partition in the transformed attribute space using a fixed number of equally distributed triangular membership functions. Despite this transformation, the definition of every fuzzy set in the original space can be recovered by applying the inverse cumulative distribution function (also known as the quantile function). The experimental results reveal that the proposed methodology allows the state-of-the-art multi-way fuzzy decision tree (FMDT) induction algorithm to maintain classification accuracy with up to 6 million fewer leaves. Comment: Appeared in 2018 IEEE International Congress on Big Data (BigData Congress). arXiv admin note: text overlap with arXiv:1902.0935
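The two-step process in this abstract can be illustrated with a minimal single-machine sketch. It is not the authors' distributed implementation: the function names (`quantile_points`, `approx_cdf`, `memberships`, `fuzzify`), the linear interpolation between quantiles, and the choice of cut points are all assumptions made for illustration.

```python
from bisect import bisect_right

def quantile_points(sample, q):
    """Return q+1 cut points (minimum, inner q-quantiles, maximum) of the sample."""
    s = sorted(sample)
    n = len(s)
    return [s[(i * (n - 1)) // q] for i in range(q + 1)]

def approx_cdf(x, cuts):
    """Approximate F(x) by linear interpolation between the cut points (assumed scheme)."""
    q = len(cuts) - 1
    if x <= cuts[0]:
        return 0.0
    if x >= cuts[-1]:
        return 1.0
    i = bisect_right(cuts, x) - 1          # cuts[i] <= x < cuts[i + 1]
    lo, hi = cuts[i], cuts[i + 1]
    return (i + (x - lo) / (hi - lo)) / q

def memberships(u, k):
    """Degrees of u in [0, 1] for k evenly spaced triangular fuzzy sets (k >= 2).

    The centres sit at j / (k - 1); adjacent triangles overlap so that the
    degrees sum to 1 everywhere on [0, 1] (a Ruspini strong partition).
    """
    return [max(0.0, 1.0 - abs(u - j / (k - 1)) * (k - 1)) for j in range(k)]

def fuzzify(x, cuts, k):
    """Step 1 (probability integral transform) followed by step 2 (partition)."""
    return memberships(approx_cdf(x, cuts), k)

if __name__ == "__main__":
    data = [v * v for v in range(101)]     # a skewed toy attribute
    cuts = quantile_points(data, 10)
    print(fuzzify(400, cuts, 5))
```

Because the partition is defined in the transformed space, the fuzzy sets stay evenly spaced there while their shapes in the original space follow the density of the data.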

    Learning dissimilarity-based distances for the KNN classification algorithm

    The aim of this project is to improve the KNN (k-nearest neighbours) algorithm by replacing the classical Euclidean distance with parametrized dissimilarities that are tuned by a genetic algorithm. The idea is for the genetic algorithm to learn a set of parameters that are then used to compute the distances between instances, instead of relying on classical distances such as the Euclidean one. We also consider instance and attribute selection, so that the genetic algorithm can exclude noisy instances. This technique also speeds up the distance computation, since reducing the number of instances and attributes means fewer calculations are needed. Finally, we compare the resulting variants with the original KNN algorithm to determine whether classification improves. Degree: Bachelor's Degree in Computer Engineering, Universidad Pública de Navarra (Nafarroako Unibertsitate Publikoa).
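As a rough illustration of the idea in this abstract, the sketch below runs KNN with a per-attribute weighted distance and an optional instance mask. It assumes the weights and mask have already been produced by some search procedure such as a genetic algorithm, which is not shown; the weighted Euclidean form and the names `weighted_distance` and `knn_predict` are illustrative assumptions, not the parametrized dissimilarities used in the project.

```python
from collections import Counter
from math import sqrt

def weighted_distance(a, b, weights):
    """Euclidean distance with one weight per attribute (a weight of 0 drops it)."""
    return sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def knn_predict(train, query, weights, k=3, keep=None):
    """Majority vote among the k nearest retained training instances.

    train : list of (features, label) pairs
    keep  : optional list of booleans; False excludes an instance (e.g. noise)
    """
    candidates = [
        (features, label)
        for i, (features, label) in enumerate(train)
        if keep is None or keep[i]
    ]
    nearest = sorted(
        candidates,
        key=lambda item: weighted_distance(item[0], query, weights),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

if __name__ == "__main__":
    train = [
        ((0.0, 10.0), "a"),
        ((0.1, -10.0), "a"),
        ((1.0, 0.0), "b"),
        ((0.9, 0.5), "b"),
    ]
    query = (0.05, 0.2)
    print(knn_predict(train, query, weights=(1.0, 1.0), k=1))
    print(knn_predict(train, query, weights=(1.0, 0.0), k=1))
```

In the toy data the second attribute is pure noise, so switching its weight to zero changes which neighbours count as nearest; attributes with zero weight and masked instances could also be skipped entirely to save computation.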

    Genetic biodiversity of marine organisms in Cabrera National Park: applications for conservation

    10 pages; 1 figure; 2 tables. We studied the genetic diversity of populations of several representative species of the marine benthos (sponges, cnidarians, ascidians, echinoderms and fish) in Cabrera National Park. We used evolutionarily neutral molecular markers with high mutation rates (microsatellites) and, in some cases, mitochondrial genes. Sampling was carried out inside the Park, in several areas of the islands of Mallorca and Ibiza, and along the peninsular coast. Our results indicate that numerous species of sessile invertebrates and some fish, which form an essential part of the Park's rocky ecosystems, are genetically isolated from adjacent areas. This implies that the larval or adult stages of these species do not come mainly from areas near the Park (not even from the island of Mallorca) but from the Park itself. In other words, there is a high level of self-recruitment. Therefore, if these populations disappeared from the Park, their recovery would be slow. Peer reviewed

    Diversity, structure and spatial distribution of megabenthic communities in Cap de Creus continental shelf and submarine canyon (NW Mediterranean)

    The continental shelf and submarine canyon off Cap de Creus (NW Mediterranean) were declared a Site of Community Importance (SCI) within the Natura 2000 Network in 2014. Implementing an effective management plan to preserve its biological diversity and monitor its evolution through time requires a detailed characterization of its benthic ecosystem. Based on 60 underwater video transects performed between 2007 and 2013 (before the declaration of the SCI), we thoroughly describe the composition and structure of the main megabenthic communities dwelling from the shelf down to 400 m depth inside the submarine canyon. We then mapped the spatial distribution of the benthic communities using the Random Forest algorithm, which incorporated geomorphological and oceanographic layers as predictors, as well as the intensity of the bottom-trawling fishing fleet. Although the study area has historically been exposed to commercial fishing practices, it still holds a rich benthic ecosystem with over 165 different invertebrate (morpho)species of the megafauna identified in the video footage, which form up to 9 distinct megabenthic communities. The continental shelf is home to coral gardens of the sea fan Eunicella cavolini, sea pen and soft coral assemblages, dense beds of the crinoid Leptometra phalangium, diverse sponge grounds and massive aggregations of the brittle star Ophiothrix fragilis. The submarine canyon off Cap de Creus is characterized by a cold-water coral community dominated by the scleractinian coral Madrepora oculata, found in association with several invertebrate species including oysters, brachiopods and a variety of sponge species, as well as by a community dominated by cerianthids and sea urchins, mostly in sedimentary areas. The benthic communities identified in the area were then compared with habitats/biocenoses described in reference habitat classification systems that consider circalittoral and bathyal environments of the Mediterranean.
    The complex environmental setting characteristic of the marine area off Cap de Creus likely produces the optimal conditions for communities dominated by suspension- and filter-feeding species to develop. The uniqueness of this ecosystem and the anthropogenic pressures that it faces should prompt the development of effective management actions to ensure the long-term conservation of the benthic fauna representative of this marine area.